
    Ward's Hierarchical Clustering Method: Clustering Criterion and Agglomerative Algorithm

    The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. However, the literature contains different interpretations of the method, and commonly used software systems implement the Ward agglomerative algorithm differently, including differing expressions of the agglomerative criterion. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward's hierarchical clustering method. Comment: 20 pages, 21 citations, 4 figures
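To make the phrase "agglomerative criterion" concrete, here is a minimal sketch of one common expression of it: the increase in the error sum of squares caused by merging two clusters, computed both directly and through the closed form |A||B| / (|A| + |B|) times the squared distance between centroids. The function names and the toy data are illustrative assumptions, not taken from the paper or from any particular software package.

```python
from typing import List, Sequence

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def error_sum_of_squares(points: List[Point]) -> float:
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(sum((p[d] - c[d]) ** 2 for d in range(len(c))) for p in points)


def ward_merge_cost(a: List[Point], b: List[Point]) -> float:
    """Closed-form increase in the error sum of squares when merging a and b."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


# Toy clusters (hypothetical data, chosen only for illustration).
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(5.0, 4.0)]

# Direct computation: ESS of the merged cluster minus the ESS of the parts.
direct = (
    error_sum_of_squares(cluster_a + cluster_b)
    - error_sum_of_squares(cluster_a)
    - error_sum_of_squares(cluster_b)
)
closed_form = ward_merge_cost(cluster_a, cluster_b)
```

At each agglomeration step, an algorithm of this kind would merge the pair of clusters with the smallest such cost. Implementations that feed in distances rather than squared distances, or that report the square root of this quantity, may produce different dendrogram heights, which is one way expressions of the criterion can diverge.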

    DHODH modulates transcriptional elongation in the neural crest and melanoma

    Melanoma is a tumour of transformed melanocytes, which are originally derived from the embryonic neural crest. It is unknown to what extent the programs that regulate neural crest development interact with mutations in the BRAF oncogene, which is the most commonly mutated gene in human melanoma. We have used zebrafish embryos to identify the initiating transcriptional events that occur on activation of human BRAF(V600E) (which encodes an amino acid substitution mutant of BRAF) in the neural crest lineage. Zebrafish embryos that are transgenic for mitfa:BRAF(V600E) and lack p53 (also known as tp53) have a gene signature that is enriched for markers of multipotent neural crest cells, and neural crest progenitors from these embryos fail to terminally differentiate. To determine whether these early transcriptional events are important for melanoma pathogenesis, we performed a chemical genetic screen to identify small-molecule suppressors of the neural crest lineage, which were then tested for their effects on melanoma. One class of compound, inhibitors of dihydroorotate dehydrogenase (DHODH), for example leflunomide, led to an almost complete abrogation of neural crest development in zebrafish and to a reduction in the self-renewal of mammalian neural crest stem cells. Leflunomide exerts these effects by inhibiting the transcriptional elongation of genes that are required for neural crest development and melanoma growth. When used alone or in combination with a specific inhibitor of the BRAF(V600E) oncogene, DHODH inhibition led to a marked decrease in melanoma growth both in vitro and in mouse xenograft studies. Taken together, these studies highlight developmental pathways in neural crest cells that have a direct bearing on melanoma formation.

    Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm

    Over the past five decades, k-means has become the clustering algorithm of choice in many application domains primarily due to its simplicity, time/space efficiency, and invariance to the ordering of the data points. Unfortunately, the algorithm's sensitivity to the initial selection of the cluster centers remains its most serious drawback. Numerous initialization methods have been proposed to address this drawback. Many of these methods, however, have time complexity superlinear in the number of data points, which makes them impractical for large data sets. On the other hand, linear methods are often random and/or sensitive to the ordering of the data points. These methods are generally unreliable in that the quality of their results is unpredictable. Therefore, it is common practice to perform multiple runs of such methods and take the output of the run that produces the best results. Such a practice, however, greatly increases the computational requirements of the otherwise highly efficient k-means algorithm. In this chapter, we investigate the empirical performance of six linear, deterministic (non-random), and order-invariant k-means initialization methods on a large and diverse collection of data sets from the UCI Machine Learning Repository. The results demonstrate that two relatively unknown hierarchical initialization methods due to Su and Dy outperform the remaining four methods with respect to two objective effectiveness criteria. In addition, a recent method due to Erisoglu et al. performs surprisingly poorly. Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms (Springer, 2014). arXiv admin note: substantial text overlap with arXiv:1304.7465, arXiv:1209.196
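As an illustration of what "deterministic and order-invariant" initialization can mean in practice, the sketch below seeds k-means with a maximin-style rule: the first center is the data point nearest the overall mean, and each further center is the point farthest from its nearest already-chosen center, with ties broken by coordinate value instead of input position. This particular rule, the function names, and the toy data are assumptions made for illustration; the sketch is not claimed to be one of the six methods evaluated in the chapter.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points: List[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def maximin_init(points: List[Point], k: int) -> List[Point]:
    """Deterministic seeding; ties are broken by coordinates, not input order."""
    m = mean(points)
    # First center: the data point closest to the overall mean.
    centers = [min(points, key=lambda p: (sq_dist(p, m), p))]
    while len(centers) < k:
        # Next center: the point farthest from its nearest chosen center.
        centers.append(
            max(points, key=lambda p: (min(sq_dist(p, c) for c in centers), p))
        )
    return centers


def kmeans(points: List[Point], k: int, max_iters: int = 100) -> List[Point]:
    """Plain batch (Lloyd-style) k-means started from maximin_init."""
    centers = maximin_init(points, k)
    for _ in range(max_iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster happens to become empty.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers


# Toy data (hypothetical): two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers_forward = kmeans(data, 2)
centers_reversed = kmeans(data[::-1], 2)
```

Because the seeding contains no random draws, a single run suffices, and on this toy data reversing the input order leaves the resulting centers unchanged. Each seeding step scans the data once per chosen center, so the cost of this sketch grows linearly with the number of points for fixed k.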

    Commonality Preserving Multiple Instance Clustering Based on Diverse Density

    Image-set clustering is the problem of decomposing a given image set into disjoint subsets satisfying specified criteria. For single-vector image representations, a proximity or similarity criterion is widely applied, i.e., proximal or similar images form a cluster. The recent trend in image description, however, is local-feature based, i.e., an image is described by multiple local features, e.g., SIFT, SURF, and so on. In this description, which criterion should be employed for the clustering? As an answer to this question, this paper presents an image-set clustering method based on commonality, that is, images preserving strong commonality (coherent local features) form a cluster. Under this criterion, image variations that do not affect common features are harmless. In the case of face images, hair-style changes and partial occlusions by glasses may not affect the cluster formation. We defined four commonality measures based on Diverse Density that are used in agglomerative clustering. Through comparative experiments, we confirmed that two of our methods perform better than the other methods examined in the experiments.

    Molecular biology of breast cancer metastasis: Inflammatory breast cancer: clinical syndrome and molecular determinants

    Inflammatory breast cancer (IBC) is an aggressive form of locally advanced breast cancer (LABC) that affects approximately 5% of women with breast cancer annually in the USA. It is a clinically and pathologically distinct form of LABC that is particularly fast growing, invasive, and angiogenic. Nearly all women have lymph node involvement at the time of diagnosis, and approximately 36% have gross distant metastases. Despite recent advances in multimodality treatments, the prognosis of patients with IBC is poor, with a median disease-free survival of less than 2.5 years. Recent work on the genetic determinants that underlie the IBC phenotype has led to the identification of genes that are involved in the development and progression of this disease. This work has been aided by the establishment of primary human cell lines and animal models. These advances suggest novel targets for future interventions in the diagnosis and treatment of IBC.

    Differential survival following trastuzumab treatment based on quantitative HER2 expression and HER2 homodimers in a clinic-based cohort of patients with metastatic breast cancer

    Background: We have recently described the correlation between quantitative measures of HER2 expression or HER2 homodimers by the HERmark assay and objective response (RR), time to progression (TTP), and overall survival (OS) in an expanded access cohort of trastuzumab-treated HER2-positive patients with metastatic breast cancer (MBC) who were stringently selected by fluorescence in situ hybridization (FISH). Multivariate analyses suggested a continuum of HER2 expression that correlated with outcome following trastuzumab. Here we investigate the relationship between HER2 expression or HER2 homodimers and OS in a clinic-based population of patients with MBC selected primarily by IHC. Methods: HERmark, a proximity-based assay designed to detect and quantitate protein expression and dimerization in formalin-fixed paraffin-embedded (FFPE) tissues, was used to measure HER2 expression and HER2 homodimers in FFPE samples from patients with MBC. Assay results were correlated with OS using univariate Kaplan-Meier, hazard function plots, and multivariate Cox regression analyses. Results: Initial analyses revealed a parabolic relationship between continuous measures of HER2 expression and risk of death, suggesting that the assumption of linearity for the HER2 expression measurements may be inappropriate in subsequent multivariate analyses. Cox regression analyses using the categorized variable of HER2 expression level demonstrated that higher HER2 levels predicted better survival outcomes following trastuzumab treatment in the high HER2-expressing group. Conclusions: These data suggest that the quantitative amount of HER2 expression measured by HERmark may be a useful new marker to identify a more relevant target population for trastuzumab treatment in patients with MBC.

    Performance of CMS muon reconstruction in pp collision events at sqrt(s) = 7 TeV

    The performance of muon reconstruction, identification, and triggering in CMS has been studied using 40 inverse picobarns of data collected in pp collisions at sqrt(s) = 7 TeV at the LHC in 2010. A few benchmark sets of selection criteria covering a wide range of physics analysis needs have been examined. For all considered selections, the efficiency to reconstruct and identify a muon with a transverse momentum pT larger than a few GeV is above 95% over the whole region of pseudorapidity covered by the CMS muon system, abs(eta) < 2.4, while the probability to misidentify a hadron as a muon is well below 1%. The efficiency to trigger on single muons with pT above a few GeV is higher than 90% over the full eta range, and typically substantially better. The overall momentum scale is measured to a precision of 0.2% with muons from Z decays. The transverse momentum resolution varies from 1% to 6% depending on pseudorapidity for muons with pT below 100 GeV and, using cosmic rays, it is shown to be better than 10% in the central region up to pT = 1 TeV. Observed distributions of all quantities are well reproduced by the Monte Carlo simulation. Comment: Replaced with published version. Added journal reference and DOI.

    Measurement of the Forward-Backward Asymmetry in the B -> K(*) mu+ mu- Decay and First Observation of the Bs -> phi mu+ mu- Decay

    We reconstruct the rare decays $B^+ \to K^+\mu^+\mu^-$, $B^0 \to K^{*}(892)^0\mu^+\mu^-$, and $B^0_s \to \phi(1020)\mu^+\mu^-$ in a data sample corresponding to $4.4\,{\rm fb^{-1}}$ collected in $p\bar{p}$ collisions at $\sqrt{s}=1.96\,{\rm TeV}$ by the CDF II detector at the Fermilab Tevatron Collider. Using $121 \pm 16$ $B^+ \to K^+\mu^+\mu^-$ and $101 \pm 12$ $B^0 \to K^{*0}\mu^+\mu^-$ decays we report the branching ratios. In addition, we report the measurement of the differential branching ratio and the muon forward-backward asymmetry in the $B^+$ and $B^0$ decay modes, and the $K^{*0}$ longitudinal polarization in the $B^0$ decay mode with respect to the squared dimuon mass. These are consistent with the theoretical prediction from the standard model, and most recent determinations from other experiments and of comparable accuracy. We also report the first observation of the $B^0_s \to \phi\mu^+\mu^-$ decay and measure its branching ratio ${\mathcal{B}}(B^0_s \to \phi\mu^+\mu^-) = [1.44 \pm 0.33 \pm 0.46] \times 10^{-6}$ using $27 \pm 6$ signal events. This is currently the most rare $B^0_s$ decay observed. Comment: 7 pages, 2 figures, 3 tables. Submitted to Phys. Rev. Lett.

    Search for a New Heavy Gauge Boson Wprime with Electron + missing ET Event Signature in ppbar collisions at sqrt(s)=1.96 TeV

    We present a search for a new heavy charged vector boson $W^\prime$ decaying to an electron-neutrino pair in $p\bar{p}$ collisions at a center-of-mass energy of $1.96\,{\rm TeV}$. The data were collected with the CDF II detector and correspond to an integrated luminosity of $5.3\,{\rm fb}^{-1}$. No significant excess above the standard model expectation is observed and we set upper limits on $\sigma\cdot{\cal B}(W^\prime\to e\nu)$. Assuming standard model couplings to fermions and the neutrino from the $W^\prime$ boson decay to be light, we exclude a $W^\prime$ boson with mass less than $1.12\,{\rm TeV}/c^2$ at the 95% confidence level. Comment: 7 pages, 2 figures. Submitted to PR